Detecting the large entries of a sparse covariance matrix in sub-quadratic time

Authors

Ofer Shwartz

Boaz Nadler

Abstract

The covariance matrix of a p-dimensional random variable is a fundamental quantity in data analysis. Given n i.i.d. observations, it is typically estimated by the sample covariance matrix, at a computational cost of O(np^2) operations. When n, p are large, this computation may be prohibitively slow. Moreover, in several contemporary applications, the population matrix is approximately sparse, and only its few large entries are of interest. This raises the following question: Assuming approximate sparsity of the covariance matrix, can its large entries be detected much faster, say in sub-quadratic time, without explicitly computing all its p^2 entries? In this paper, we present and theoretically analyze two randomized algorithms that detect the large entries of an approximately sparse sample covariance matrix using only O(np polylog p) operations. Furthermore, assuming sparsity of the population matrix, we derive sufficient conditions on the underlying random variable and on the number of samples n, for the sample covariance matrix to satisfy our approximate sparsity requirements. Finally, we illustrate the performance of our algorithms via several simulations.
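
For orientation, the quadratic-cost baseline that the abstract refers to can be sketched as below: compute every entry of the sample covariance matrix, then keep the off-diagonal entries whose magnitude exceeds a threshold. This is only the O(np^2) reference computation, not either of the paper's randomized algorithms, which the abstract does not describe. The function names `sample_covariance`, `large_entries`, and `make_data`, the synthetic data, and the threshold 0.5 are all illustrative assumptions.

```python
import random


def sample_covariance(rows):
    """Return the p x p sample covariance of n observations (each a list of length p).

    Every pair (i, j) is visited, so the cost is O(n * p^2).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):  # the matrix is symmetric, so fill both halves at once
            s = sum(c[i] * c[j] for c in centered) / (n - 1)
            cov[i][j] = cov[j][i] = s
    return cov


def large_entries(cov, threshold):
    """List off-diagonal index pairs (i, j), i < j, with |cov[i][j]| >= threshold."""
    p = len(cov)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(cov[i][j]) >= threshold]


def make_data(n, p, seed=0):
    """Hypothetical test data: independent Gaussians, except that coordinate 1
    is a noisy copy of coordinate 0, giving one large covariance entry."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        x[1] = x[0] + 0.1 * rng.gauss(0.0, 1.0)
        rows.append(x)
    return rows


rows = make_data(n=500, p=6)
cov = sample_covariance(rows)
print(large_entries(cov, threshold=0.5))
```

The double loop over (i, j) is what becomes prohibitive when p is large; the question posed in the abstract is whether the same pairs can be found without visiting all of them.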

Similar resources

Sparse Hanson-Wright inequalities for subgaussian quadratic forms

In this paper, we provide a proof for the Hanson-Wright inequalities for sparse quadratic forms in subgaussian random variables. This provides useful concentration inequalities for sparse subgaussian random vectors in two ways. Let X = (X_1, ..., X_m) ∈ R^m be a random vector with independent subgaussian components, and ξ = (ξ_1, ..., ξ_m) ∈ {0, 1}^m be independent Bernoulli random variables. We ...

Chained Vector Simplex

An algorithm for solving linear programming problems whose coefficient matrix contains a large number of zero entries is studied. This algorithm is more useful when it is used as a sub-program in a real-time program. Singly linked lists are used to store only the non-zero entries of the coefficient matrix. The modified Revised Simplex Method is also used for solving such probl...

A Well-Conditioned and Sparse Estimation of Covariance and Inverse Covariance Matrices Using a Joint Penalty

We develop a method for estimating well-conditioned and sparse covariance and inverse covariance matrices from a sample of vectors drawn from a sub-Gaussian distribution in a high-dimensional setting. The proposed estimators are obtained by minimizing the quadratic loss function and a joint penalty of the ℓ1 norm and the variance of its eigenvalues. In contrast to some of the existing methods of covariance...

Covariance Matrix Estimation for Stationary Time Series

We obtain a sharp convergence rate for banded covariance matrix estimates of stationary processes. A precise order of magnitude is derived for spectral radius of sample covariance matrices. We also consider a thresholded covariance matrix estimator that can better characterize sparsity if the true covariance matrix is sparse. As our main tool, we implement Toeplitz [Math. Ann. 70 (1911) 351–376...

JPEN Estimation of Covariance and Inverse Covariance Matrix: A Well-Conditioned and Sparse Estimation of Covariance and Inverse Covariance Matrices Using a Joint Penalty

We develop a method for estimating well-conditioned and sparse covariance and inverse covariance matrices from a sample of vectors drawn from a sub-Gaussian distribution in a high-dimensional setting. The proposed estimators are obtained by minimizing the quadratic loss function and a joint penalty of the ℓ1 norm and the variance of its eigenvalues. In contrast to some of the existing methods of covariance...


Journal:

CoRR

Volume: abs/1505.03001

Publication date: 2015
